\section{Server}
\label{serverimpl}

This section analyses how well the server implementation fulfils the design outlined in section \ref{ServerDesign}. In general, implementing the server was straightforward, with very few problems encountered during its development. The most challenging part of the server's development related to concurrency, and is discussed in section \ref{concur}. Spelling errors and typos were the most common cause of problems; however, given how easy they were to fix, they are not discussed.

The server was coded relatively quickly, with the majority of functionality complete before any serious progress was made with the client. This worked particularly well as it meant the client was not hindered by the lack of server support and its core networking components could be developed very quickly. It also meant that the server could be thoroughly tested before being used by the client. This made debugging the client easier as it was clear which bugs were caused by the client, and which bugs were caused by the server.

\subsection{Concurrency}
\label{concur}
Concurrency was the biggest concern while implementing the server, and it was very important that we got it right. Concurrency problems such as race conditions are generally considered to be among the most difficult to debug, so great care was taken to keep the possibility of these problems as low as possible. Although we discovered some problems with threading (discussed in section \ref{server_eval}), none of the bugs were caused by race conditions. Instead, most of the threading-related bugs were caused by confusion about how threads work conceptually. For example, in one case an attempt to kill another thread actually caused the current thread to terminate itself. The following describes how thread safety was dealt with and the lessons we learned.

Originally the server used the HashMap class from the \texttt{java.util} package extensively, primarily because of its quick look-up times and its ability to use user IDs (Strings) as keys. Each User has five HashMaps to store data: their friends, the users who have them as a friend, friend requests, blocked users, and the rooms they are currently in. Each room uses one to store current users and another to store invited users. The global Data class uses another three to store all of the rooms, users, and workers on the server. This was an issue because HashMaps are not thread-safe, which meant that every access to them had to be wrapped in a synchronised block. This was tedious and prone to human error. Later in the development of the server we discovered the \texttt{java.util.concurrent} package, which contains thread-safe alternatives to some of the classes in the \texttt{java.util} package, including a thread-safe counterpart to HashMap called ConcurrentHashMap. Switching to ConcurrentHashMap allowed us to remove much of the boilerplate code that made the original implementation thread-safe, and made working with the maps safer and less error-prone.
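
The difference between the two approaches can be sketched as follows. This is a minimal illustration rather than the server's actual code: the \texttt{User} stand-in and the \texttt{SynchronisedFriends} and \texttt{ConcurrentFriends} classes are hypothetical names introduced only for this example.

\begin{verbatim}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the server's User class; only the ID is modelled.
class User {
    final String id;
    User(String id) { this.id = id; }
}

// Before: every access to the plain HashMap needs an explicit lock.
class SynchronisedFriends {
    private final Map<String, User> friends = new HashMap<>();

    void add(User u) {
        synchronized (friends) { friends.put(u.id, u); }
    }

    User get(String id) {
        synchronized (friends) { return friends.get(id); }
    }
}

// After: ConcurrentHashMap makes each individual operation thread-safe.
class ConcurrentFriends {
    private final Map<String, User> friends = new ConcurrentHashMap<>();

    void add(User u) { friends.put(u.id, u); }

    User get(String id) { return friends.get(id); }

    // putIfAbsent is atomic, so "insert only if missing" needs no extra lock.
    boolean addIfNew(User u) { return friends.putIfAbsent(u.id, u) == null; }
}
\end{verbatim}

Note that ConcurrentHashMap only guarantees the safety of individual operations. A sequence such as a look-up followed by an insert is still a compound action, and should use an atomic method such as \texttt{putIfAbsent} or be guarded separately.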

The Buffer class implements a thread-safe, unbounded blocking queue. Essentially the buffer is a LinkedList made thread-safe by only allowing items to be added and removed through synchronised methods. This ensures that no more than one operation can be performed on the list at the same time, preventing race conditions from occurring.
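
A minimal sketch of such a buffer is shown below. The method names \texttt{put}, \texttt{take} and \texttt{size}, and the use of a generic type parameter, are assumptions made for illustration; the real Buffer class may differ in both.

\begin{verbatim}
import java.util.LinkedList;

// Sketch of a thread-safe, unbounded blocking queue; names are illustrative.
class Buffer<T> {
    private final LinkedList<T> items = new LinkedList<>();

    // Append an item and wake any thread blocked in take().
    public synchronized void put(T item) {
        items.addLast(item);
        notifyAll();
    }

    // Remove the oldest item, blocking while the buffer is empty.
    public synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();   // releases the lock until put() calls notifyAll()
        }
        return items.removeFirst();
    }

    public synchronized int size() {
        return items.size();
    }
}
\end{verbatim}

Because every method is synchronised on the same object, only one thread can touch the list at a time. The \texttt{while} loop around \texttt{wait()} re-checks the condition after waking, which guards against spurious wake-ups. The standard library's \texttt{java.util.concurrent.LinkedBlockingQueue} offers comparable behaviour.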

One of the most difficult bugs to find occurred when a user attempted to log in from two different clients. The server located the worker allocated to the client that was already logged in, placed an \texttt{:ERROR:} and a \texttt{:QUIT:} command into its buffer, and then set the user's worker to the worker of the new client. The new client then mysteriously disconnected. We eventually realised that putting the \texttt{:QUIT:} command into the worker's buffer did not disconnect it immediately: there was a slight delay between placing the command into the buffer and the command being executed. The \texttt{:QUIT:} command logs out the user and then kills their client. As we had already set the user's worker to the one associated with the new client by the time the command ran, we were effectively killing both the old and new workers. This was something we had not considered when designing the server.

Fortunately the solution was fairly simple and did not require any sizeable changes to the structure of the server. We forced the new worker to wait until the old worker had died, and therefore had emptied its buffer and logged the user out, meaning that it was now safe to assign the new worker to the user and log them in.
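
The idea of waiting for the old worker can be sketched with \texttt{Thread.join()}, as below. This is only an illustration under stated assumptions: the \texttt{Worker} and \texttt{LoginHandover} classes, their fields, and the use of a \texttt{LinkedBlockingQueue} as the command buffer are all hypothetical, and the sketch ignores the case of several simultaneous logins.

\begin{verbatim}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical worker: processes queued commands until it reaches :QUIT:.
class Worker extends Thread {
    final BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
    volatile boolean loggedOut = false;

    @Override
    public void run() {
        try {
            while (!buffer.take().equals(":QUIT:")) {
                // handle any other command here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        loggedOut = true;   // log the user out before the thread ends
    }
}

// Hypothetical holder for the worker currently assigned to one user.
class LoginHandover {
    private volatile Worker current;

    // Called on behalf of the new client when the same user logs in again.
    void takeOver(Worker newWorker) throws InterruptedException {
        Worker old = current;
        if (old != null && old != newWorker) {
            old.buffer.put(":ERROR:");
            old.buffer.put(":QUIT:");
            old.join();       // wait until the old worker thread has finished
        }
        current = newWorker;  // the old worker can no longer log the user out
    }

    Worker current() { return current; }
}
\end{verbatim}

The important point is the ordering: the user's worker is reassigned only after \texttt{join()} returns, so the delayed \texttt{:QUIT:} command can never act on the new worker.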

\subsection{Detecting Abuse and Enforcing Limits}
Originally the server did not enforce any of the limits defined in the protocol and these had to be introduced later on in development. As we were aware from the start of development that they would need to be implemented in the future, we were able to design the server so that they could easily be added. 

Limiting the number of commands in a moving window turned out to be an interesting problem to solve efficiently. We could not simply count the number of commands and reset the count after 5 seconds, as this would not be a rolling window. We realised that we would have to record the times of the last 100 commands, and compare the time of the most recent command against the time of the oldest command. To make this as efficient as possible we store the times in an array of size 100, used as a circular buffer: for each command we check the time stored at the current index, overwrite it with the new time, and move on to the next index, wrapping round at the end of the array. If the time stored at the current index is less than 5000 milliseconds old, then we know that more than 100 commands have been executed within 5 seconds.

The following code describes the algorithm used:

\begin{verbatim}
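// Record the arrival time of the current command.
this.lastCommandTime = System.currentTimeMillis();
// The slot about to be overwritten holds the oldest recorded time
// (0 means the slot has not been used yet).
long oldest = this.commandTimes[this.last];

// If the oldest command is less than 5 seconds old, the limit is exceeded.
if (oldest != 0 && oldest > (this.lastCommandTime - 5000))
    killWorker();

this.commandTimes[this.last] = this.lastCommandTime;
// Advance to the next slot, wrapping round the whole array.
this.last = (this.last + 1) % this.commandTimes.length;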
\end{verbatim}
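
For experimentation, the same rolling-window idea can be packaged as a small self-contained class. This is a sketch only: the class name \texttt{CommandRateLimiter}, the \texttt{record} method, and the choice to pass the current time in as a parameter (which makes the check easy to test) are assumptions not taken from the server's code.

\begin{verbatim}
// Self-contained sketch of the rolling-window check; names are hypothetical.
class CommandRateLimiter {
    private final long[] commandTimes;  // circular buffer of command times (ms)
    private final long windowMillis;    // length of the rolling window
    private int last = 0;               // index of the oldest recorded time

    CommandRateLimiter(int maxCommands, long windowMillis) {
        this.commandTimes = new long[maxCommands];
        this.windowMillis = windowMillis;
    }

    // Records a command at time 'now'; returns true if the limit is exceeded.
    // A stored time of 0 is treated as "slot not used yet".
    boolean record(long now) {
        long oldest = commandTimes[last];
        boolean exceeded = oldest != 0 && oldest > now - windowMillis;
        commandTimes[last] = now;
        last = (last + 1) % commandTimes.length;
        return exceeded;
    }
}
\end{verbatim}

With \texttt{new CommandRateLimiter(100, 5000)} the class mirrors the limits described above, and a caller would kill the worker whenever \texttt{record(System.currentTimeMillis())} returns \texttt{true}.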

Because the rate-limiting check already records the time of the last command, that time can also be used to help identify workers which have timed out. As the Timeout class does not need to be particularly accurate, it is only run once every 30 seconds and sleeps until it is next scheduled. The Timeout class works with the workers and relies on them to keep their times updated. It loops through every worker on the system, comparing the time of its last command against the current time, and kills the worker if the difference is greater than 15000 milliseconds.
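
The periodic sweep might look roughly like the following sketch. The \texttt{TimedWorker} class, the \texttt{sweep} method and the constant names are hypothetical; only the 15000 millisecond limit and the 30 second interval come from the description above.

\begin{verbatim}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical view of a worker: only what the timeout check needs.
class TimedWorker {
    volatile long lastCommandTime;
    volatile boolean killed = false;

    TimedWorker(long lastCommandTime) { this.lastCommandTime = lastCommandTime; }

    void kill() { killed = true; }
}

class Timeout implements Runnable {
    static final long TIMEOUT_MILLIS = 15000;  // idle limit from the text
    static final long SWEEP_MILLIS = 30000;    // interval between sweeps

    private final Map<String, TimedWorker> workers;

    Timeout(Map<String, TimedWorker> workers) { this.workers = workers; }

    // One pass over all workers; returns how many were killed.
    int sweep(long now) {
        int killedCount = 0;
        for (TimedWorker w : workers.values()) {
            if (!w.killed && now - w.lastCommandTime > TIMEOUT_MILLIS) {
                w.kill();
                killedCount++;
            }
        }
        return killedCount;
    }

    @Override
    public void run() {
        try {
            while (true) {
                sweep(System.currentTimeMillis());
                Thread.sleep(SWEEP_MILLIS);   // sleep until the next sweep
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // stop when interrupted
        }
    }
}
\end{verbatim}

One consequence of sweeping only every 30 seconds is that, in this sketch, an idle worker may survive for up to roughly 45 seconds before it is killed, which is consistent with the check not needing to be particularly accurate.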
